Vision&Language papers and notes

Wu et al - Ask Me Anything: Free-Form Visual Question Answering
Suo et al - 2021 - Proposal-free One-stage Referring Expression via Grid-Word Cross-Attention
Gao et al - 2021 - Chop Chop BERT: Visual Question Answering by Chopping VisualBERT’s Heads
Gao et al - 2020 - Structured Multimodal Attentions for TextVQA
Wang et al - 2021 - Towards End-to-End Text Spotting in Natural Scenes
Li et al - 2019 - Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition